Abstract


1. Introduction and Motivation


Logistic regression is the most widely used form of binary regression (see Berkson, 1951; Cox, 1970; Efron, 1975; Pregibon, 1981). Independent observations $(y_i, x_i)$ are observed, where the $(x_i)$ are fixed $p$-vector predictors and the $(y_i)$ are Bernoulli variates with

$$\Pr\{y_i = 1 \mid x_i\} = F(x_i^T\beta_0) = \{1 + \exp(-x_i^T\beta_0)\}^{-1}. \qquad (1.1)$$


Subject to regularity conditions, the large sample distribution of the maximum likelihood estimator of $\beta_0$ is approximately normal with mean $\beta_0$ and covariance matrix $(1/n)S_n^{-1}(\beta_0)$, where $S_n(\cdot)$ is defined for $\gamma \in R^p$ as

$$S_n(\gamma) = n^{-1}\sum_{i=1}^{n} F^{(1)}(x_i^T\gamma)\,x_i x_i^T, \qquad (1.2)$$

with $F^{(k)}$ denoting the $k$th derivative of $F$.
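As a concrete illustration of (1.1) and (1.2), the following minimal sketch evaluates $S_n(\gamma)$ and the approximate covariance $(1/n)S_n^{-1}(\beta_0)$ with NumPy. The function names and the four-point design are hypothetical and chosen only for illustration; they do not come from the study.

```python
import numpy as np

def F(t):
    """Logistic distribution function F(t) = 1 / (1 + exp(-t))."""
    return 1.0 / (1.0 + np.exp(-t))

def F1(t):
    """First derivative F^(1)(t) = F(t) * (1 - F(t))."""
    p = F(t)
    return p * (1.0 - p)

def S_n(gamma, x):
    """Matrix S_n(gamma) of (1.2) for an n-by-p design matrix x."""
    w = F1(x @ gamma)                      # one weight per observation
    return (x * w[:, None]).T @ x / x.shape[0]

# Hypothetical design: intercept plus one predictor, evaluated at beta_0 = 0.
x = np.column_stack([np.ones(4), [-1.0, 0.0, 1.0, 2.0]])
beta0 = np.zeros(2)
S = S_n(beta0, x)
approx_cov = np.linalg.inv(S) / x.shape[0]   # (1/n) S_n^{-1}(beta_0)
```

At $\beta_0 = 0$ every weight equals $1/4$, so `S` reduces to $x^T x/(4n)$, which makes the example easy to check by hand.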


Motivation for our paper comes from the Framingham Heart Study (Gordon and Kannel, 1968), a prospective study of the development of cardiovascular disease. This ongoing investigation has had an important impact on the epidemiology of heart disease. Much of the analysis is based on the logistic regression model with $y$ an indicator of heart disease and $x$ a vector of baseline risk factors such as systolic blood pressure, serum cholesterol, smoking, etc. It is well known that many of these baseline predictors, systolic blood pressure for example, are measured with substantial error. When a person's "true" blood pressure is defined as a long-term average, individual readings are subject to temporal as well as reader-machine variability. In one group of 45-54 year old Framingham males it was estimated that one fourth of the observed variability in blood pressure readings was due to within-subject variability. The second author was asked by some Framingham investigators to assess the impact of such substantial measurement error and to suggest alternatives to usual logistic regression which account for this error. The present study is an outgrowth of these questions.


When covariates are measured with error the usual logistic regression estimator of $\beta_0$ is asymptotically biased; see Clark (1982) and Michalik and Tripathi (1980). As a consequence of bias there is generally a tendency to underestimate the disease probability for high risk cases and overestimate it for low risk cases; it will be said that measurement error attenuates predicted probabilities. Also, bias creates a problem with hypothesis testing; in Section 2 it is shown that the usual asymptotic tests for individual regression components can have level higher than expected. An example of this occurs in an unbalanced two-group analysis of covariance where interest lies in testing for treatment effect but the covariable is measured with error.

The severity of these problems depends, of course, on the magnitude of the measurement error. In some situations ordinary logistic regression might perform satisfactorily. However, when measurement error is substantial, alternative procedures are necessary. In addition, the availability of techniques which correct for measurement error can make clear the need for better measurement, e.g., more blood pressure readings over a period of days.

In  Section  2  our  measurement  error  model  is  defined  and  the  asymptotic  bias 
in  the  usual  logistic  regression  estimator  is  studied.  Section  3  presents  some 
alternative  estimators;  results  of  a  Monte  Carlo  study  are  outlined  in  Section 
4;  proofs  of  the  asymptotic  results  are  given  in  Section  5. 

Until recently the study of measurement error models has focused primarily on linear models; see the review article by Madansky (1959) and the papers by Fuller (1980) and Gleser (1981). Interest in nonlinear models is increasing, with recent contributions by Prentice (1982); Wolter and Fuller (1982a, 1982b); Carroll, Spiegelman, Lan, Bailey, and Abbott (1984); Armstrong (1984); Amemiya (1982); and Clark (1982). Of these articles Clark (1982) and Carroll et al. (1984) focus specifically on logistic regression. The asymptotic methods employed in this paper are similar to those used by Wolter and Fuller (1982a) and Amemiya (1982) in their studies of nonlinear functional relationships.

2.  A  Measurement  Error  Model  for  Logistic  Regression. 

2.1.  The  Model. 

Our measurement error model starts with (1.1), but rather than observing the $p$-vector $x_i$ we observe

$$X_i = x_i + \sigma v_i, \quad \text{where} \quad v_i = \Sigma^{1/2}\epsilon_i. \qquad (2.1)$$

In (2.1) $\Sigma^{1/2}$ is the square root of a symmetric positive semi-definite matrix $\Sigma$ scaled so that $\|\Sigma\| = 1$, and $(\epsilon_i)$ are independent and identically distributed random vectors with zero mean and identity covariance; also $\epsilon_i$ is independent of $y_i$, $i = 1, \ldots, n$. The scale factor $\sigma$ dictates the magnitude of the measurement error, e.g. if $X_i$ is a mean of $m$ independent replicate measurements of $x_i$ then $\sigma \propto m^{-1/2}$. The asymptotic theory presented in this paper requires that $\sigma \to 0$ as $n \to \infty$, i.e. large-sample, small-measurement-error asymptotics. The asymptotics are relevant for two situations: (i) $X_i$ is an average of $m$ independent measurements of $x_i$, in which case the Central Limit Theorem suggests that $(\epsilon_i)$ should be viewed as normal random variates, and (ii) measurement error is small but nonnegligible. In the latter case the moments of order greater than two of $(\epsilon_i)$ generally differ from those of a normal variate.


Our methods of correcting for bias require knowledge of the error covariance matrix $V = \sigma^2\Sigma$. Since this information is seldom available, all asymptotic results are derived for the case in which $V$ is replaced by an estimator $\hat V$ satisfying

$$n^{1/2}(\hat V - V) = O_p(\sigma^2). \qquad (2.2)$$

Condition (2.2) is satisfied, for example, when $V$ is estimated by replication. It is convenient to write $\hat V = \hat\sigma^2\hat\Sigma$ where $\hat\sigma^2 = \|\hat V\|$ and $\hat\Sigma = \hat V/\|\hat V\|$; note that (2.2) then implies $n^{1/2}(1 - \hat\sigma^2/\sigma^2) = O_p(1)$.
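The replication estimate of $V$ and the factorization $\hat V = \hat\sigma^2\hat\Sigma$ can be sketched as follows. This is a minimal illustration under stated assumptions: each subject has the same number $m \ge 2$ of replicates, the pooled within-subject covariance is divided by $m$ because $X_i$ is the replicate mean, and the spectral norm is used for $\|\cdot\|$ (the text does not say which norm is meant). The function name and the tiny data set are hypothetical.

```python
import numpy as np

def estimate_error_covariance(W):
    """W has shape (n, m, p): m replicate measurements of each p-vector x_i.

    Returns the replicate means X, V_hat, sigma2_hat and Sigma_hat.
    """
    n, m, p = W.shape
    X = W.mean(axis=1)                                  # observed covariates X_i
    R = W - X[:, None, :]                               # within-subject residuals
    pooled = np.einsum('ijk,ijl->kl', R, R) / (n * (m - 1))
    V_hat = pooled / m                                  # covariance of a mean of m replicates
    sigma2_hat = np.linalg.norm(V_hat, 2)               # spectral norm (assumed choice)
    Sigma_hat = V_hat / sigma2_hat                      # scaled so that its norm is 1
    return X, V_hat, sigma2_hat, Sigma_hat

# Tiny hypothetical data set: n = 2 subjects, m = 2 replicates, p = 2 components.
W = np.array([[[0.0, 0.0], [2.0, 0.0]],
              [[1.0, 1.0], [1.0, 1.0]]])
X, V_hat, sigma2_hat, Sigma_hat = estimate_error_covariance(W)
```

Only the first component varies within subjects here, so $\hat\Sigma$ has a single non-zero entry, mirroring the situation where an intercept or an exactly measured covariate sits alongside an error-prone one.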
2.2. The Effects of Measurement Error.

Our investigation starts with a study of the estimator obtained by regressing $y_i$ on the observed $X_i$. This estimator, to be called $\hat\beta$, maximizes

$$L_n(\gamma) = n^{-1}\sum_{i=1}^{n}\left\{ y_i \log F(c_i^T\gamma) + (1 - y_i)\log F(-c_i^T\gamma)\right\} \qquad (2.3)$$

and satisfies

$$\sum_{i=1}^{n} \left(y_i - F(c_i^T\hat\beta)\right)c_i = 0, \qquad (2.4)$$

when $c_i = X_i$, $i = 1, \ldots, n$. Our interest lies in the behavior of $\hat\beta$ as $\max(\sigma, n^{-1}) \to 0$. In addition to assumptions on the errors $\epsilon_i$, some design conditions are necessary to insure weak consistency of $\hat\beta$. We shall work with the following assumptions:

(C1) $G_n(\gamma)$ converges pointwise to a function $G(\gamma)$ possessing a unique maximum at $\beta_0$, where $G_n(\cdot)$ is defined as
$$G_n(\gamma) = n^{-1}\sum_{i=1}^{n}\left\{F(x_i^T\beta_0)\log F(x_i^T\gamma) + F(-x_i^T\beta_0)\log F(-x_i^T\gamma)\right\};$$

(C2) $\sum_{i=1}^{n}\|x_i\|^2 = o(n^2)$;

(C3) $E(\|\epsilon_1\|) < \infty$.

(C1) is an assumption of convenience since for each $n$, $G_n(\cdot)$ is concave with a maximum at $\beta_0$. Weaker conditions could thus be employed by studying subsequences of $G_n(\cdot)$; see Theorem 10.9, Rockafellar (1970).

Consistency of $\hat\beta$ is proved in Theorem 5.1; this result is necessary to establish the following asymptotic expansion, which is crucial to our investigation. Theorem 1 gives conditions such that with $N(\sigma, n) = \max(\sigma^2, n^{-1/2})$,

$$\hat\beta = \beta_0 + n^{-1/2}S_n^{-1}(\beta_0)Z_n + \sigma^2 S_n^{-1}(\beta_0)(J_{n,1} + J_{n,2})\beta_0 + o_p(N(\sigma, n)), \qquad (2.5)$$

where

$$Z_n = n^{-1/2}\sum_{i=1}^{n}\left(y_i - F(x_i^T\beta_0)\right)x_i;$$

$$J_{n,1} = -(2n)^{-1}\sum_{i=1}^{n}F^{(2)}(x_i^T\beta_0)\,x_i\beta_0^T\Sigma;$$

$$J_{n,2} = -n^{-1}\sum_{i=1}^{n}F^{(1)}(x_i^T\beta_0)\,\Sigma.$$
Theorem 1. (Asymptotic expansion of $\hat\beta$). Assume that $\hat\beta$ is a consistent estimator of $\beta_0$ satisfying (2.4). Also assume:

(A1) There exist a positive definite matrix $M$, $\delta > 0$, and $N_0 < \infty$, such that $S_n(\gamma) \ge M$ whenever $n \ge N_0$ and $\|\gamma - \beta_0\| \le \delta$;

(A2) $n^{-1}\sum_{i=1}^{n}\|x_i\|^2 \to \bar{x}^2 < \infty$, $\max_{1\le i\le n}\|x_i\| = o(\sigma^{-2})$;

(A3) $E(\epsilon_1) = 0$, $E(\epsilon_1\epsilon_1^T) = I$, $E(\|\epsilon_1\|^{2+\alpha}) \le B$ for some $\alpha > 0$, $B < \infty$.

Then $\hat\beta$ has the expansion given in (2.5).

Note that assumptions (A1) and (A2) are sufficient to insure asymptotic normality for $Z_n$ by an appeal to the Lindeberg Central Limit Theorem. Thus Theorem 1 indicates that with $\lambda = n^{1/2}\sigma^2$ we can expect $n^{1/2}(\hat\beta - \beta_0)$ to be approximately normally distributed with mean $\lambda S_n^{-1}(\beta_0)(J_{n,1} + J_{n,2})\beta_0$ and covariance $S_n^{-1}(\beta_0)$, when $n$ is large and $\sigma$ is small. When $X_i$ is a mean of $m$ replicates, $\sigma^2 \propto m^{-1}$ and $\lambda$ describes the relationship between the sample size and the rate of replication. The asymptotic bias obviously decreases with increasing replication.

We can use expansion (2.5) to construct a corrected estimator, $\hat\beta_c$, which has smaller asymptotic bias. Before doing so we comment on the problems with $\hat\beta$ alluded to in the introduction.


Bias and attenuation. Consider simple logistic regression through the origin with $\beta_0 > 0$. One expects to see attenuation, i.e., a negative first order bias term. For most designs this is true. Somewhat surprisingly, and completely at odds with the linear regression case, $S_n^{-1}(\beta_0)(J_{n,1} + J_{n,2})\beta_0$ can be positive. One design in which this occurs arises when most cases have very high or very low risk, i.e. $|x_i\beta_0|$ is large for most $i$.
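The sign change described above can be checked numerically. The sketch below evaluates the first order bias term $S_n^{-1}(\beta_0)(J_{n,1} + J_{n,2})\beta_0$ from (2.5) for two hypothetical one-predictor designs through the origin with $\Sigma = 1$ and $\beta_0 = 1$: one where every $x_i\beta_0 = 0.5$ (moderate risk) and one where every $x_i\beta_0 = 3$ (very high risk). The function name and both designs are illustrative assumptions.

```python
import numpy as np

def F(t):
    """Logistic distribution function."""
    return 1.0 / (1.0 + np.exp(-t))

def first_order_bias(x, beta0, Sigma):
    """Return S_n^{-1}(beta0) (J_{n,1} + J_{n,2}) beta0 for an n-by-p design x."""
    n = x.shape[0]
    p = F(x @ beta0)
    f1 = p * (1.0 - p)                  # F^(1)(x_i' beta0)
    f2 = f1 * (1.0 - 2.0 * p)           # F^(2)(x_i' beta0)
    S = (x * f1[:, None]).T @ x / n     # S_n(beta0), equation (1.2)
    q = beta0 @ Sigma @ beta0           # beta0' Sigma beta0
    J1_beta = -(f2[:, None] * x).sum(axis=0) * q / (2.0 * n)
    J2_beta = -f1.mean() * (Sigma @ beta0)
    return np.linalg.solve(S, J1_beta + J2_beta)

beta0 = np.array([1.0])
Sigma = np.eye(1)
b_moderate = first_order_bias(np.full((10, 1), 0.5), beta0, Sigma)[0]
b_extreme = first_order_bias(np.full((10, 1), 3.0), beta0, Sigma)[0]
```

Under these assumptions `b_moderate` is negative (attenuation) while `b_extreme` is positive, in line with the remark that designs dominated by very high or very low risk cases can reverse the sign of the bias.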

Hypothesis Testing. Consider a two-group analysis of covariance, $x_i^T = (1, (-1)^i, d_i)$, $\beta_0^T = (\beta_0, \beta_1, \beta_2)$. The covariable $d_i$ is measured with error variance $\sigma^2$. Often interest lies in testing hypotheses about the treatment effect $\beta_1$. A standard method to test $\beta_1 = 0$ is to compare its logistic regression estimate with the usual estimate of its asymptotic standard error. When the asymptotics of Theorem 1 are relevant and $n^{1/2}\sigma^2 \to \lambda > 0$, this test approaches its nominal level only if the second component of $S_n^{-1}(\beta_0)(J_{n,1} + J_{n,2})\beta_0$ approaches zero. Letting $s_2^T$ denote the second row of $S_n^{-1}(\beta_0)$, this is achieved only if

$$n^{-1}\sum_{i=1}^{n} s_2^T x_i\, F^{(2)}(x_i^T\beta_0)\,\beta_2^2 \to 0.$$

This will not hold in the common epidemiologic situation in which the true covariables are not balanced across the two treatments. Thus, when substantial measurement error occurs in a nonrandomized study, there will be bias in the asymptotic levels of the usual tests.

3.  Accounting  for  Measurement  Error. 

In this section three alternative approaches to estimation are studied. The first is based on expansion (2.5) and is distribution-free in the sense that only moment assumptions are made about the measurement errors. The other two methods are based on an assumption of normally distributed errors; their asymptotic properties are then studied under more general conditions.
3.1 Adjusting for Bias in $\hat\beta$.

Write $b_n = S_n^{-1}(\beta_0)(J_{n,1} + J_{n,2})\beta_0$ and $\hat b_n = \hat S_n^{-1}(\hat\beta)(\hat J_{n,1} + \hat J_{n,2})\hat\beta$ where

$$\hat S_n(\gamma) = n^{-1}\sum_{i=1}^{n}F^{(1)}(X_i^T\gamma)\,X_iX_i^T; \qquad (3.1)$$

$$\hat J_{n,1} = -(2n)^{-1}\sum_{i=1}^{n}F^{(2)}(X_i^T\hat\beta)\,X_i\hat\beta^T\hat\Sigma;$$

$$\hat J_{n,2} = -n^{-1}\sum_{i=1}^{n}F^{(1)}(X_i^T\hat\beta)\,\hat\Sigma;$$

$\hat b_n$ depends only on the observed data and, under the conditions of Theorem 1 and (2.2), approximates $b_n$ in the sense that $\hat b_n - b_n = o_p(1)$ as $\min(n, \sigma^{-1}) \to \infty$. This result suggests that the bias-corrected estimator $\hat\beta_c = \hat\beta - \hat\sigma^2\hat b_n$ should have smaller asymptotic bias for large $n$ and small $\sigma$. We state these results as a theorem.

Theorem 2. Assume the conditions of Theorem 1 and (2.2). Then

$$\hat\beta_c = \beta_0 + n^{-1/2}S_n^{-1}(\beta_0)Z_n + o_p(N(\sigma, n)).$$

Remarks. In Section 5, Theorem 2 is proved using the following characterization of $\hat\beta_c$: Note that $\hat\beta_c = (I - \hat\sigma^2\hat B_n)\hat\beta$ where $\hat B_n = \hat S_n^{-1}(\hat\beta)(\hat J_{n,1} + \hat J_{n,2})$. Since $X_i^T\hat\beta = X_i^T(I - \hat\sigma^2\hat B_n)^{-1}\hat\beta_c$ it follows that $\hat\beta_c$ maximizes (2.3) when $c_i = \hat x_{i,c}$, defined as

$$\hat x_{i,c} = X_i + \hat\sigma^2(I - \hat\sigma^2\hat B_n^T)^{-1}\hat B_n^T X_i. \qquad (3.2)$$

In this sense $\hat\beta_c$ is a type of two-stage estimator obtained by doing logistic regression with $\hat x_{i,c}$ replacing $X_i$.

The estimator $\hat\beta_c$ is not unbiased, just less biased. The Monte Carlo study of Section 4 shows that in some realistic sampling situations the reduction in bias is substantial.
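A minimal sketch of the bias-corrected estimator $\hat\beta_c = \hat\beta - \hat\sigma^2\hat b_n$ follows. It is an illustration under stated assumptions rather than the code used for the study: the naive fit uses a plain Newton-Raphson iteration, the error covariance is treated as known instead of estimated by replication, the spectral norm is used to split $V$ into $\hat\sigma^2\hat\Sigma$, and the simulated design (normal predictor, intercept $-1.4$, slope $1.4$, error variance one third of the predictor variance, $n = 50000$) is chosen only so the effect is visible. All function names are hypothetical.

```python
import numpy as np

def F(t):
    """Logistic distribution function."""
    return 1.0 / (1.0 + np.exp(-t))

def fit_logistic(c, y, iters=25):
    """Maximize (2.3) by Newton-Raphson; c is the n-by-p matrix of regressors."""
    beta = np.zeros(c.shape[1])
    for _ in range(iters):
        p = F(c @ beta)
        score = c.T @ (y - p)
        info = (c * (p * (1.0 - p))[:, None]).T @ c
        beta = beta + np.linalg.solve(info, score)
    return beta

def corrected_estimator(X, y, sigma2_hat, Sigma_hat):
    """Return the naive estimate and the bias-corrected estimate of Section 3.1."""
    n = X.shape[0]
    beta = fit_logistic(X, y)
    p = F(X @ beta)
    f1 = p * (1.0 - p)                       # F^(1)(X_i' beta_hat)
    f2 = f1 * (1.0 - 2.0 * p)                # F^(2)(X_i' beta_hat)
    S = (X * f1[:, None]).T @ X / n          # equation (3.1)
    J1 = -np.outer((f2[:, None] * X).sum(axis=0), beta @ Sigma_hat) / (2.0 * n)
    J2 = -f1.mean() * Sigma_hat
    b_hat = np.linalg.solve(S, (J1 + J2) @ beta)
    return beta, beta - sigma2_hat * b_hat

# Hypothetical simulated data; the error covariance V is treated as known here.
rng = np.random.default_rng(0)
n = 50000
d = rng.normal(0.0, np.sqrt(0.10), n)
y = (rng.random(n) < F(-1.4 + 1.4 * d)).astype(float)
V = np.array([[0.0, 0.0], [0.0, 0.10 / 3.0]])
X = np.column_stack([np.ones(n), d + rng.normal(0.0, np.sqrt(V[1, 1]), n)])
sigma2_hat = np.linalg.norm(V, 2)
Sigma_hat = V / sigma2_hat
beta_hat, beta_c = corrected_estimator(X, y, sigma2_hat, Sigma_hat)
```

In this setup the naive slope `beta_hat[1]` is attenuated toward zero, and `beta_c[1]` moves it back toward the true value 1.4 without removing the bias entirely, consistent with the remark that $\hat\beta_c$ is less biased rather than unbiased.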
3.2 Normal Measurement Error.

When measurement error is present there is an added source of variation which is not accounted for by model (1.1). We now expand this model by assuming that $(\epsilon_i)$ are normally distributed, an assumption which is not unreasonable in some situations. The functional log-likelihood for estimating $\beta_0$ and $x_1, \ldots, x_n$ is then

$$\sum_{i=1}^{n}\left\{y_i\log F(x_i^T\beta) + (1 - y_i)\log F(-x_i^T\beta) - (2\sigma^2)^{-1}(X_i - x_i)^T\Sigma^{-1}(X_i - x_i)\right\}. \qquad (3.3)$$

The vectors $\hat\beta_F, \hat x_1, \ldots, \hat x_n$ maximizing (3.3) satisfy

$$\sum_{i=1}^{n}\left(y_i - F(\hat x_i^T\hat\beta_F)\right)\hat x_i = 0;$$

$$\hat x_i = X_i + \left(y_i - F(\hat x_i^T\hat\beta_F)\right)\sigma^2\Sigma\hat\beta_F, \quad i = 1, \ldots, n.$$

There are two problems with this estimator: it depends on the unknown matrix $\sigma^2\Sigma$, and solving for $\hat\beta_F$ and $(\hat x_i)$ is difficult. For these reasons we suggest a modified version of $\hat\beta_F$. Noting the form of $\hat x_i$ we let

$$\hat x_{i,f} = X_i + \left(y_i - F(X_i^T\hat\beta)\right)\hat\sigma^2\hat\Sigma\hat\beta \qquad (3.4)$$

and define $\hat\beta_f$ as the estimator obtained by maximizing (2.3) with $c_i = \hat x_{i,f}$. $\hat\beta_f$ is consistent under (C1)-(C3) and (2.2) and has an asymptotic expansion given in the next theorem.

Theorem 3. Assume the conditions of Theorem 1 and (2.2). Then

$$\hat\beta_f = \beta_0 + n^{-1/2}S_n^{-1}(\beta_0)Z_n + \sigma^2 S_n^{-1}(\beta_0)J_{n,1}\beta_0 + o_p(N(\sigma, n)).$$

Remarks. The functional maximum likelihood estimator, like $\hat\beta$, has a first order bias. The bias term is not due to our one-step modification nor to $\hat V$; this fact is evident from the proof of Theorem 3. Logistic regression is thus unlike linear regression, in which the errors-in-variables functional maximum likelihood estimator is consistent when $V$ is known.

Our final estimator starts with an assumption of normal errors and exploits the consequences of sufficiency. Given $\sigma^2\Sigma$ and $\beta_0$, a sufficient statistic for estimating $x_i$ is $c_i(\beta_0) = X_i + \sigma^2(y_i - \tfrac12)\Sigma\beta_0$; it follows that the distribution of $y_i$ given $c_i(\beta_0)$ does not depend on $x_i$. The reason for using this particular sufficient statistic is that

$$P\{y_i = 1 \mid c_i(\beta_0)\} = F(c_i^T(\beta_0)\beta_0) \qquad (3.4)$$

and hence the score equation

$$\sum_{i=1}^{n}\left(y_i - F(c_i^T(\beta)\beta)\right)c_i(\beta) = 0 \qquad (3.5)$$

is unbiased for $\beta_0$. Equation (3.5) can have multiple solutions, not all of which produce a consistent sequence of estimators. Since $c_i(\beta)$ also depends on the unknown matrix $\sigma^2\Sigma$ we propose the following modification: Let

$$\hat x_{i,s} = X_i + (y_i - \tfrac12)\hat\sigma^2\hat\Sigma\hat\beta \qquad (3.6)$$

and define $\hat\beta_s$, the sufficiency estimator, as the maximizer of (2.3) when $c_i$ is replaced by $\hat x_{i,s}$. This estimator is consistent under (C1)-(C3) and (2.2) and has the expansion given in the next theorem.

Theorem 4. Assume the conditions of Theorem 1 and (2.2). Then

$$\hat\beta_s = \beta_0 + n^{-1/2}S_n^{-1}(\beta_0)Z_n + o_p(N(\sigma, n)).$$

Remarks. 1. Theorem 4 does not require the assumption of normal measurement error. Also, $\hat\beta$ can be replaced by any consistent estimator in the definition of $\hat x_{i,s}$. The effects of nonnormal measurement error and our particular choice of $\hat x_{i,s}$ become apparent only when $\hat\beta_s$ is expanded through terms of order $N^2(\sigma, n)$. This analysis is lengthy and is not presented here.

2. It is possible to define a sufficiency estimator for a large class of measurement error models. In particular we have in mind the generalized linear models with canonical link functions (McCullagh and Nelder, 1983). A complete exposition of this theory will appear elsewhere.
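The one-step functional estimator and the sufficiency estimator differ only in the residual that multiplies $\hat\sigma^2\hat\Sigma\hat\beta$: $y_i - F(X_i^T\hat\beta)$ for the former and $y_i - \tfrac12$ for the latter. The sketch below implements both as two-stage fits. It is an illustration under stated assumptions: the Newton-Raphson routine, the function names, the simulated design and the use of a known error covariance in place of a replication estimate are all hypothetical choices.

```python
import numpy as np

def F(t):
    """Logistic distribution function."""
    return 1.0 / (1.0 + np.exp(-t))

def fit_logistic(c, y, iters=25):
    """Maximize (2.3) by Newton-Raphson; c is the n-by-p matrix of regressors."""
    beta = np.zeros(c.shape[1])
    for _ in range(iters):
        p = F(c @ beta)
        info = (c * (p * (1.0 - p))[:, None]).T @ c
        beta = beta + np.linalg.solve(info, c.T @ (y - p))
    return beta

def two_stage_estimator(X, y, V_hat, kind):
    """Refit (2.3) after replacing X_i by the adjusted covariates x_{i,f} or x_{i,s}."""
    beta_hat = fit_logistic(X, y)              # naive first-stage estimate
    if kind == "functional":
        r = y - F(X @ beta_hat)                # residual used for x_{i,f}
    elif kind == "sufficiency":
        r = y - 0.5                            # residual used for x_{i,s}
    else:
        raise ValueError(kind)
    X_adj = X + np.outer(r, V_hat @ beta_hat)  # V_hat = sigma2_hat * Sigma_hat
    return beta_hat, fit_logistic(X_adj, y)

# Hypothetical simulated data; the error covariance V is treated as known here.
rng = np.random.default_rng(0)
n = 50000
d = rng.normal(0.0, np.sqrt(0.10), n)
y = (rng.random(n) < F(-1.4 + 1.4 * d)).astype(float)
V = np.array([[0.0, 0.0], [0.0, 0.10 / 3.0]])
X = np.column_stack([np.ones(n), d + rng.normal(0.0, np.sqrt(V[1, 1]), n)])
beta_hat, beta_f = two_stage_estimator(X, y, V, "functional")
_, beta_s = two_stage_estimator(X, y, V, "sufficiency")
```

Because the first row and column of `V` are zero, the intercept column of `X` is left untouched and only the error-prone covariate is shifted. Iterating the second stage (feeding the new estimate back into the adjustment) would give the fully iterated versions mentioned in Section 5.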

In  the  next  section  results  from  a  small  Monte  Carlo  study  are  presented. 

4.  Monte  Carlo 

We conducted a small simulation experiment to determine the relative merits of the four estimators $\hat\beta$, $\hat\beta_c$, $\hat\beta_f$, and $\hat\beta_s$.

The model for the study was

$$\Pr\{y_i = 1 \mid d_i\} = F(\alpha + \beta d_i), \quad i = 1, \ldots, n. \qquad (4.1)$$

We considered these sampling situations, where $\chi_1^2$ denotes a chi-squared random variable with one degree of freedom:

(I) $(\alpha, \beta) = (-1.4, 1.4)$, $(d_i) \sim \text{Normal}(0, \sigma_d^2 = .10)$, $n = 300, 600$;

(II) $(\alpha, \beta) = (-1.4, 1.4)$, $(d_i) \sim \sigma_d(\chi_1^2 - 1)/\sqrt{2}$, $\sigma_d^2 = .10$, $n = 300, 600$.

For both cases, the measurement error variance $\tau^2$ was one third the variance of the true predictors ($\tau^2 = \sigma_d^2/3$). For each case, we considered two sampling distributions for the measurement errors $(\epsilon_i)$: (a) Normal$(0, \tau^2)$ and (b) a contaminated normal distribution, which is Normal$(0, \tau^2)$ with probability 0.90 and Normal$(0, 25\tau^2)$ with probability 0.10.
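The sampling situations above can be generated along the following lines. This is a sketch, not the original simulation code: the function name `simulate` and its arguments are hypothetical, and it produces a single error-prone measurement $D_i = d_i + \text{error}$ per subject with error variance $\tau^2$ (case (a)) or the contaminated mixture (case (b)).

```python
import numpy as np

def simulate(n, case, errors, rng, alpha=-1.4, beta=1.4, sigma2_d=0.10):
    """Draw (y, d, D): responses, true predictors, and one error-prone measurement each."""
    tau2 = sigma2_d / 3.0
    if case == "I":                                   # normal predictors
        d = rng.normal(0.0, np.sqrt(sigma2_d), n)
    elif case == "II":                                # skewed predictors
        d = np.sqrt(sigma2_d) * (rng.chisquare(1, n) - 1.0) / np.sqrt(2.0)
    else:
        raise ValueError(case)
    sd = np.full(n, np.sqrt(tau2))
    if errors == "b":                                 # contaminated normal: 10% have variance 25 tau^2
        sd = np.where(rng.random(n) < 0.10, 5.0 * sd, sd)
    elif errors != "a":
        raise ValueError(errors)
    D = d + sd * rng.standard_normal(n)
    prob = 1.0 / (1.0 + np.exp(-(alpha + beta * d)))  # model (4.1)
    y = (rng.random(n) < prob).astype(float)
    return y, d, D

rng = np.random.default_rng(1)
y, d, D = simulate(200000, "II", "b", rng)
```

Under the contaminated distribution the overall error variance is $0.9\tau^2 + 0.1 \cdot 25\tau^2 = 3.4\tau^2$, which is why case (b) produces markedly more attenuation than case (a) in the tables.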

We believe these two sampling situations are realistic, but their representativeness is limited by the size of the study. The sample sizes $n = 300, 600$ may seem large, but our primary interest is in larger epidemiologic studies where such sample sizes are common. For example, Clark (1982) was motivated by a study with $n = 2580$, Hauck (1983) quotes a partially completed study with $n \ge 340$, and we have analyzed Framingham data for males aged 45-54 with $n = 589$. Furthermore, the results of the study suggest that correcting for measurement error in most small sample situations is unwarranted.

The values of the predictor variance $\sigma_d^2$ and the measurement error variance $\tau^2$ are similar to those found in the Framingham cohort mentioned in the previous paragraph when the predictor was $\log_e\{(\text{systolic blood pressure} - 75)/3\}$, a standard transformation. The ratio $\tau^2/\sigma_d^2 = 1/3$ is not uncommon; Clark finds a similar ratio in her study of triglyceride. The choice of $(\alpha, \beta)$ comes from Framingham data as well. All experiments were repeated 100 times.

In each experiment, we sampled two independent measurements $(D_{i1}, D_{i2})$ of each $d_i$; the observed covariate was $X_i = (1, \bar D_i)^T$, where $\bar D_i = (D_{i1} + D_{i2})/2$. The matrix $\sigma^2\Sigma$ has only one non-zero entry, which was estimated by the sample variance of $(D_{i1} - D_{i2})/2$.

In addition to the four estimators presented in this paper we included in the study a proposal due to Clark (1982). She suggests the estimator $\hat\beta_N$, obtained by maximizing (2.3) when $c_i$ is replaced by

$$\hat x_{i,N} = X_i - \hat\sigma^2\hat\Sigma\hat\Sigma_X^{-1}(X_i - \hat\mu),$$

where $\hat\mu$ and $\hat\Sigma_X$ are the sample mean and covariance of the observed data. Motivation for this estimator derives from an assumption of normal errors and normal covariates. In this case $E(x_i \mid X_i) = X_i - \sigma^2\Sigma\Sigma_X^{-1}(X_i - \mu)$ and hence $\hat x_{i,N}$ is a natural estimator of $x_i$. Theorems 5.1 and 5.2 can be used to prove consistency and derive an asymptotic expansion for this estimator. Like $\hat\beta$ and $\hat\beta_f$, $\hat\beta_N$ has a non-zero first order bias, although the expression is too lengthy to present here.
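The covariate adjustment behind Clark's estimator can be sketched as below. This is a hedged illustration, not her implementation: the function name is hypothetical, and the adjustment is applied only to the error-prone, non-constant covariates `Z`, on the assumption that an intercept column would be left out because its sample variance is zero and the sample covariance matrix would otherwise be singular.

```python
import numpy as np

def clark_adjusted_covariates(Z, V_hat):
    """Shrink each observed row Z_i toward the sample mean:
    Z_i - V_hat @ inv(S_Z) @ (Z_i - mean), with S_Z the sample covariance of Z."""
    mu = Z.mean(axis=0)
    S_Z = np.atleast_2d(np.cov(Z, rowvar=False))       # sample covariance (divisor n - 1)
    # Rows of (Z - mu) times inv(S_Z) @ V_hat; both matrices are symmetric.
    return Z - (Z - mu) @ np.linalg.solve(S_Z, V_hat)

# Tiny hypothetical example: one covariate, two observations, error variance 1.
Z = np.array([[0.0], [2.0]])
V_hat = np.array([[1.0]])
Z_adj = clark_adjusted_covariates(Z, V_hat)
```

Here the sample variance is 2 and the error variance is 1, so each observation is pulled halfway toward the mean of 1. The estimator $\hat\beta_N$ would then be obtained by refitting (2.3) with the adjusted covariates.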

Sweeping conclusions cannot be made from such a small study. However, we can make the following qualitative suggestions. First, $\hat\beta$ is less variable but more biased than the others; sample sizes such as $n = 600$ as in the study or Clark's $n = 2580$ are such that bias dominates and hence are candidates for using corrected estimators; an opposite conclusion holds for small sample sizes where variance dominates. A second suggestion from the tables is that when $\hat\beta$ loses efficiency (Cases I(b), II(b) and when $n = 600$), the corrected estimators perform quite well.

Both $\hat\beta_f$ and $\hat\beta_s$ were defined via an assumption of normal errors, yet they also performed well when the errors were contaminated normal (Cases I(b), II(b)). Clark's estimator proved to be sensitive to the assumption of normal covariates; $\hat\beta_N$ performed very well in our study when the predictors were normally distributed, but it did have a noticeable drop in efficiency when the predictors were highly skewed (Case II). Finally, the corrected estimator $\hat\beta_c$, which was derived with no distributional assumptions for either the predictors or errors, performed well throughout the study.

In summary, the Monte Carlo results suggest that the estimators $\hat\beta_c$, $\hat\beta_f$, $\hat\beta_s$ and Clark's $\hat\beta_N$ are useful alternatives to $\hat\beta$ when covariates are measured with error. The pressing practical problem now appears to be to delineate those situations in which ordinary logistic regression should be corrected for its bias. Studies of inference and more detailed comparisons of alternative estimators will be enhanced by the identification of those problems where measurement error severely affects the usual estimation and inference.

5. Proofs of Theorems

Consider the estimator $\tilde\beta$ obtained by maximizing (2.3) when $c_i$ is replaced with $\tilde x_i$ where

$$\tilde x_i = x_i + \sigma v_i + \sigma^2 g_{in}. \qquad (5.1)$$

In Theorem 5.1 we prove weak consistency of $\tilde\beta$ under conditions (C1), (C2), (C3) and

(P1) $\sum_{i=1}^{n}\|g_{in}\|^2 = O_p(n)$.

In Theorem 5.2 an asymptotic expansion for $\tilde\beta$ is given. The consistency and asymptotic expansions of $\hat\beta$, $\hat\beta_c$, $\hat\beta_f$, and $\hat\beta_s$ follow from these general results by noting that $X_i$, $\hat x_{i,c}$, $\hat x_{i,f}$ and $\hat x_{i,s}$ all have the representation given in (5.1). We remind the reader that all the asymptotic expressions hold as $\max(\sigma, n^{-1}) \to 0$.

Theorem 5.1 (Consistency). Assume (C1), (C2), (C3), and (P1); then $\tilde\beta - \beta_0 = o_p(1)$.

Proof. Define $\tilde L_n(\gamma)$ to be the function obtained by taking $c_i = \tilde x_i$ in (2.3). The identity $\log(F(t)/(1 - F(t))) = t$ is used to show $\tilde L_n(\gamma) - G_n(\gamma) = R_{n,1} + R_{n,2}$, where

$$R_{n,1} = n^{-1}\sum_{i=1}^{n}\left(y_i - F(x_i^T\beta_0)\right)x_i^T\gamma;$$

$$R_{n,2} = n^{-1}\sum_{i=1}^{n}\left\{y_i(\tilde x_i^T\gamma - x_i^T\gamma) + \log F(-\tilde x_i^T\gamma) - \log F(-x_i^T\gamma)\right\}.$$

Under (C2), $R_{n,1}$ has mean zero and asymptotically negligible variance; also by (C3) and (P1),

$$|R_{n,2}| \le 2\sigma\|\gamma\|\,n^{-1}\sum_{i=1}^{n}\|v_i + \sigma g_{in}\| = o_p(1).$$

Consequently (C1) implies that $\tilde L_n(\cdot)$ converges pointwise in probability to $G(\cdot)$. An appeal to Theorem II.1 of Andersen and Gill (1982) concludes the proof.

The consistency results follow by applying Theorem 5.1 first to $\hat\beta$ ($g_{in} = 0$) and then to $\hat\beta_c$, $\hat\beta_f$, and $\hat\beta_s$. Next we derive the asymptotic expansions for these estimators.

Theorem 5.2 (Asymptotic expansion). Assume (P1) and the conditions of Theorem 1; then

$$\tilde\beta = \beta_0 + n^{-1/2}S_n^{-1}(\beta_0)Z_n + \sigma^2 S_n^{-1}(\beta_0)\left\{(J_{n,1} + J_{n,2})\beta_0 + b_{n,3} + b_{n,4}\right\} + o_p(N(\sigma, n)),$$

where

$$b_{n,3} = n^{-1}\sum_{i=1}^{n}\left(y_i - F(x_i^T\beta_0)\right)g_{in};$$

$$b_{n,4} = -n^{-1}\sum_{i=1}^{n}F^{(1)}(x_i^T\beta_0)\,x_i\,g_{in}^T\beta_0;$$

$S_n(\cdot)$ is given in (1.2), and $Z_n$, $J_{n,1}$, and $J_{n,2}$ are defined in (2.5).

Theorem 5.2 is proved with a series of lemmas. First we show how Theorems 1-4 follow as corollaries. Theorem 1 is immediate since $g_{in} = 0$ for $\hat\beta$. For $\hat\beta_c$, $g_{in} = (\hat\sigma^2/\sigma^2)(I - \hat\sigma^2\hat B_n^T)^{-1}\hat B_n^T X_i$ where $\hat B_n = \hat S_n^{-1}(\hat\beta)(\hat J_{n,1} + \hat J_{n,2})$. Assumptions (A2), (A3), Lemma 5.1, and (2.2) imply $b_{n,3} = o_p(1)$ and

$$-b_{n,4} = n^{-1}\sum_{i=1}^{n}F^{(1)}(x_i^T\beta_0)\,x_ix_i^T\hat B_n(I - \hat\sigma^2\hat B_n)^{-1}\beta_0 + o_p(1)$$
$$= S_n(\beta_0)\hat B_n\beta_0 + o_p(1)$$
$$= (J_{n,1} + J_{n,2})\beta_0 + o_p(1),$$

thus proving Theorem 2.

For $\hat\beta_f$, $g_{in} = (\hat\sigma^2/\sigma^2)(y_i - F(X_i^T\hat\beta))\hat\Sigma\hat\beta$ and (A2), (A3), Lemma 5.1, and (2.2) imply $b_{n,4} = o_p(1)$ and

$$b_{n,3} = n^{-1}\sum_{i=1}^{n}\left(y_i - F(x_i^T\beta_0)\right)^2\Sigma\beta_0 + o_p(1) = -J_{n,2}\beta_0 + o_p(1);$$

Theorem 3 follows. Finally for $\hat\beta_s$, $g_{in} = (\hat\sigma^2/\sigma^2)(y_i - \tfrac12)\hat\Sigma\hat\beta$. (A2), (A3), Lemma 5.1 and (2.2) imply

$$b_{n,3} = n^{-1}\sum_{i=1}^{n}\left(y_i - F(x_i^T\beta_0)\right)(y_i - \tfrac12)\Sigma\beta_0 + o_p(1) = -J_{n,2}\beta_0 + o_p(1);$$

$$b_{n,4} = -n^{-1}\sum_{i=1}^{n}F^{(1)}(x_i^T\beta_0)(y_i - \tfrac12)\,x_i\beta_0^T\Sigma\beta_0 + o_p(1)$$
$$= -n^{-1}\sum_{i=1}^{n}F^{(1)}(x_i^T\beta_0)\left(F(x_i^T\beta_0) - \tfrac12\right)x_i\beta_0^T\Sigma\beta_0 + o_p(1)$$
$$= -J_{n,1}\beta_0 + o_p(1).$$

In the last step we use the identity $F^{(2)}(t) = F^{(1)}(t)(1 - 2F(t))$. This proves Theorem 4. Notice that in deriving these results we used only the fact that $\hat\beta - \beta_0 = o_p(1)$; thus the conclusions of Theorems 3 and 4 remain unchanged if $\hat\beta$ is replaced by any other consistent estimator in the definitions of $\hat x_{i,f}$ and $\hat x_{i,s}$. In particular this implies that the fully iterated versions of the functional and sufficiency estimators (provided consistent versions are chosen) also satisfy Theorems 3 and 4 respectively.
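For completeness, the identity for the logistic distribution function used in the last step of the argument for the sufficiency estimator follows from differentiating $F(t) = (1 + e^{-t})^{-1}$ twice. The short derivation below is a standard calculation and also records the form in which the identity enters the expression for $b_{n,4}$.

```latex
\begin{align*}
F^{(1)}(t) &= \frac{e^{-t}}{(1 + e^{-t})^{2}} = F(t)\{1 - F(t)\}, \\
F^{(2)}(t) &= F^{(1)}(t)\{1 - F(t)\} - F(t)F^{(1)}(t) = F^{(1)}(t)\{1 - 2F(t)\}, \\
F^{(1)}(t)\{F(t) - \tfrac12\} &= -\tfrac12 F^{(2)}(t).
\end{align*}
```

Substituting the last line with $t = x_i^T\beta_0$ turns $-n^{-1}\sum F^{(1)}(x_i^T\beta_0)(F(x_i^T\beta_0) - \tfrac12)x_i\beta_0^T\Sigma\beta_0$ into $(2n)^{-1}\sum F^{(2)}(x_i^T\beta_0)x_i\beta_0^T\Sigma\beta_0 = -J_{n,1}\beta_0$.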

The proof of Theorem 5.2 starts with the following weak law.

Lemma 5.1. Let $u_1, u_2, \ldots$ be independent random vectors such that $E(u_i) = 0$ and $E(\|u_i\|^{1+\alpha}) \le B$ for some $\alpha > 0$ and $B < \infty$. If $\sum_{i=1}^{n}|a_i| = O(n)$ and $\max_{1\le i\le n}(|a_i|/n) = o(1)$ then $n^{-1}\sum_{i=1}^{n}a_iu_i = o_p(1)$.

Proof. The proof of the lemma entails a routine verification of the assumptions of Theorem 5.2.3, Chung (1974) and is not given here.


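Lemma 5.2. Under the conditions of Theorem 1,

$$n^{-1}\sum_{i=1}^{n}\left(y_i - F(X_i^T\beta_0)\right)X_i = n^{-1/2}Z_n + \sigma^2(J_{n,1} + J_{n,2})\beta_0 + o_p(N(\sigma, n)).$$

Proof. $n^{-1}\sum (y_i - F(X_i^T\beta_0))X_i = T_{n,1} + T_{n,2}$ where

$$T_{n,1} = n^{-1}\sum_{i=1}^{n}\left(y_i - F(X_i^T\beta_0)\right)x_i;$$

$$T_{n,2} = \sigma n^{-1}\sum_{i=1}^{n}\left(y_i - F(X_i^T\beta_0)\right)v_i.$$

A Taylor series expansion of $F(\cdot)$ shows that

$$T_{n,1} = n^{-1/2}Z_n + \sigma^2J_{n,1}\beta_0 + n^{-1/2}Q_{n,1,\sigma} + \sigma^2(D_{n,1} + R_{n,1})$$

where

$$Q_{n,1,\sigma} = -\sigma n^{-1/2}\sum_{i=1}^{n}F^{(1)}(x_i^T\beta_0)\,v_i^T\beta_0\,x_i;$$

$$D_{n,1} = -(2n)^{-1}\sum_{i=1}^{n}F^{(2)}(x_i^T\beta_0)\left\{(v_i^T\beta_0)^2 - \beta_0^T\Sigma\beta_0\right\}x_i;$$

$$R_{n,1} = -(2n)^{-1}\sum_{i=1}^{n}\left\{F^{(2)}(\bar x_i^T\beta_0) - F^{(2)}(x_i^T\beta_0)\right\}(v_i^T\beta_0)^2x_i;$$

and $\bar x_i$ is on the line segment joining $x_i$ to $X_i$. $Q_{n,1,\sigma}$ has mean zero and asymptotically negligible variance, thus $n^{-1/2}Q_{n,1,\sigma} = o_p(n^{-1/2})$. Assumptions (A2) and (A3) and Lemma 5.1 are used to show $D_{n,1} = o_p(1)$. Also note that

$$\|R_{n,1}\| \le (2n)^{-1}\sum_{i=1}^{n}\|x_i\|(v_i^T\beta_0)^2\min(1, \sigma|v_i^T\beta_0|) \le A_nB_n$$

where

$$A_n = \left(n^{-1}\sum_{i=1}^{n}\|x_i\|^2(v_i^T\beta_0)^2\right)^{1/2};$$

$$B_n = \left(n^{-1}\sum_{i=1}^{n}(v_i^T\beta_0)^2\min{}^2(1, \sigma|v_i^T\beta_0|)\right)^{1/2}.$$

Assumptions (A2) and (A3) and Lemma 5.1 imply $A_n = O_p(1)$, while (A3) and the fact that $\max(n^{-1}, \sigma) \to 0$ imply $B_n = o_p(1)$. It follows that $\sigma^2(D_{n,1} + R_{n,1}) = o_p(\sigma^2)$. Combining these results we get

$$T_{n,1} = n^{-1/2}Z_n + \sigma^2J_{n,1}\beta_0 + o_p(N(\sigma, n)). \qquad (5.2)$$

Another Taylor series expansion of $F(\cdot)$ shows that

$$T_{n,2} = \sigma^2J_{n,2}\beta_0 + n^{-1/2}Q_{n,2,\sigma} + \sigma^2(D_{n,2} + R_{n,2})$$

where

$$Q_{n,2,\sigma} = \sigma n^{-1/2}\sum_{i=1}^{n}\left(y_i - F(x_i^T\beta_0)\right)v_i;$$

$$D_{n,2} = -n^{-1}\sum_{i=1}^{n}F^{(1)}(x_i^T\beta_0)(v_iv_i^T - \Sigma)\beta_0;$$

$$R_{n,2} = -n^{-1}\sum_{i=1}^{n}\left\{F^{(1)}(\bar x_i^T\beta_0) - F^{(1)}(x_i^T\beta_0)\right\}v_iv_i^T\beta_0;$$

and $\bar x_i$ lies on the line segment joining $x_i$ to $X_i$. $Q_{n,2,\sigma}$, $D_{n,2}$ and $R_{n,2}$ are all $o_p(1)$; the proofs are analogous to those for $Q_{n,1,\sigma}$, $D_{n,1}$, and $R_{n,1}$ respectively. Consequently

$$T_{n,2} = \sigma^2J_{n,2}\beta_0 + o_p(N(\sigma, n)). \qquad (5.3)$$

Combining (5.2) and (5.3) completes the proof of the lemma.

Lemma 5.3. Assume the conditions of Theorem 1 and (P1) and define $H_n(\gamma) = n^{-1}\sum_{i=1}^{n}(y_i - F(\tilde x_i^T\gamma))\tilde x_i$; then

$$H_n(\beta_0) = n^{-1/2}Z_n + \sigma^2\left\{(J_{n,1} + J_{n,2})\beta_0 + b_{n,3} + b_{n,4}\right\} + o_p(N(\sigma, n)).$$

Proof. $H_n(\beta_0) = W_{n,1} + W_{n,2} + W_{n,3} + W_{n,4}$ where

$$W_{n,1} = n^{-1}\sum_{i=1}^{n}\left(y_i - F(X_i^T\beta_0)\right)X_i;$$

$$W_{n,2} = \sigma n^{-1}\sum_{i=1}^{n}\left(F(X_i^T\beta_0) - F(\tilde x_i^T\beta_0)\right)(v_i + \sigma g_{in});$$

$$W_{n,3} = \sigma^2n^{-1}\sum_{i=1}^{n}\left(y_i - F(X_i^T\beta_0)\right)g_{in};$$

$$W_{n,4} = n^{-1}\sum_{i=1}^{n}\left(F(X_i^T\beta_0) - F(\tilde x_i^T\beta_0)\right)x_i.$$

Note that in light of (A2) and (P1),

$$\|W_{n,2}\| \le \|\beta_0\|\sigma^3n^{-1}\sum_{i=1}^{n}\|g_{in}\|(\|v_i\| + \sigma\|g_{in}\|) = o_p(\sigma^2);$$

$$\|W_{n,3} - \sigma^2b_{n,3}\| \le \sigma^2n^{-1}\sum_{i=1}^{n}|F(X_i^T\beta_0) - F(x_i^T\beta_0)|\,\|g_{in}\| \le \|\beta_0\|\sigma^3n^{-1}\sum_{i=1}^{n}\|v_i\|\,\|g_{in}\|$$
$$\le \|\beta_0\|\sigma^3\left(n^{-1}\sum_{i=1}^{n}\|v_i\|^2\right)^{1/2}\left(n^{-1}\sum_{i=1}^{n}\|g_{in}\|^2\right)^{1/2} = o_p(\sigma^2),$$

using (A3) and (P1). One term in a Taylor series expansion of $F(\cdot)$ and Lemma 5.1, (A2), and (P1) show that

$$\|W_{n,4} - \sigma^2b_{n,4}\| \le \sigma^2\|\beta_0\|^2n^{-1}\sum_{i=1}^{n}(\sigma\|v_i\| + \sigma^2\|g_{in}\|)\|x_i\|\,\|g_{in}\|$$
$$= \sigma^2\|\beta_0\|^2\left\{\sigma n^{-1}\sum_{i=1}^{n}\|v_i\|\,\|x_i\|\,\|g_{in}\| + \sigma^2n^{-1}\sum_{i=1}^{n}\|x_i\|\,\|g_{in}\|^2\right\}$$
$$\le \sigma^2\|\beta_0\|^2\left\{\sigma\left(n^{-1}\sum_{i=1}^{n}\|v_i\|^2\|x_i\|^2\right)^{1/2}\left(n^{-1}\sum_{i=1}^{n}\|g_{in}\|^2\right)^{1/2} + \sigma^2\left(\max_{1\le i\le n}\|x_i\|\right)n^{-1}\sum_{i=1}^{n}\|g_{in}\|^2\right\}$$
$$= o_p(\sigma^2).$$

An expansion for $W_{n,1}$ is given in Lemma 5.2. Combining the above results proves the Lemma.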

Lemma 5.4. Assume (P1) and the conditions of Theorem 1; then

$$\tilde\beta - \beta_0 = O_p(N(\sigma, n)).$$

Proof. Let $H_n(\cdot)$ be the function defined in Lemma 5.3. Consider the real-valued function of $\gamma$ defined as $h_n(\gamma) = H_n^T(\gamma)(\tilde\beta - \beta_0)$. The Mean Value Theorem proves the existence of some $\bar\beta$ on the line segment joining $\tilde\beta$ to $\beta_0$ such that

$$H_n^T(\beta_0)(\tilde\beta - \beta_0) = (\tilde\beta - \beta_0)^T\tilde S_n(\bar\beta)(\tilde\beta - \beta_0),$$

where $\tilde S_n(\gamma) = n^{-1}\sum_{i=1}^{n}F^{(1)}(\tilde x_i^T\gamma)\tilde x_i\tilde x_i^T$.

It follows that $\|\tilde\beta - \beta_0\| \le \|H_n(\beta_0)\|\,\lambda_{\min}^{-1}(\tilde S_n(\bar\beta))$ where $\lambda_{\min}(A)$ = minimum eigenvalue of $A$. Under (A2), (A3), and (P1), $\tilde S_n(\bar\beta) - S_n(\bar\beta) = o_p(1)$, hence by (A1), $P\{\lambda_{\min}^{-1}(\tilde S_n(\bar\beta)) \le \lambda_{\min}^{-1}(M)\} \to 1$; thus $\|\tilde\beta - \beta_0\|$ and $\|H_n(\beta_0)\|$ have the same order which, from Lemma 5.3, is $O_p(N(\sigma, n))$.

We are now in a position to prove Theorem 5.2.

Proof of Theorem 5.2. By definition $n^{-1}\sum_{i=1}^{n}(y_i - F(\tilde x_i^T\tilde\beta))\tilde x_i = 0$; expanding $F(\cdot)$ in a Taylor series shows that $\bar S(\tilde\beta - \beta_0) = H_n(\beta_0)$ where

$$\bar S = n^{-1}\sum_{i=1}^{n}F^{(1)}(\tilde x_i^T\bar\beta_i)\tilde x_i\tilde x_i^T$$

and for each $i$, $\|\bar\beta_i - \beta_0\| \le \|\tilde\beta - \beta_0\|$. (A2), (A3), (P1), and the conclusion of Lemma 5.4 are used to show $\bar S - S_n(\beta_0) = o_p(1)$. The Theorem follows from Lemma 5.3.

Acknowledgements. This research was supported by the Air Force Office of Scientific Research under grant AFOSR-80-0080. We thank Rob Abbott for suggesting the problem and the referees for useful comments on an earlier draft of this paper.


TABLES 


These are the results of the Monte Carlo study. "Efficiency" refers to mean squared error efficiency with respect to ordinary logistic regression.

CASE I(a)

$(\alpha, \beta) = (-1.4, 1.4)$, $(d_i) \sim N(0, \sigma_d^2 = .1)$, $(\epsilon_i) \sim N(0, \sigma_\epsilon^2 = \sigma_d^2/3)$.

                        $\hat\beta$   $\hat\beta_c$   $\hat\beta_f$   $\hat\beta_N$   $\hat\beta_s$
n = 300   Bias             -0.21         -0.04           -0.05           -0.02           -0.06
          Std. Dev.         0.52          0.61            0.61            0.61            0.60
          Efficiency        100%*         85%             85%             84%             88%
n = 600   Bias             -0.22         -0.05           -0.05           -0.02           -0.06
          Std. Dev.         0.33          0.38            0.38            0.38            0.38
          Efficiency        100%*         108%            106%            107%            108%

* By definition.

CASE I(b)

Same as Case I(a) but measurement errors have the contaminated normal distribution.

                        $\hat\beta$   $\hat\beta_c$   $\hat\beta_f$   $\hat\beta_N$   $\hat\beta_s$
n = 300   Bias             -0.49         -0.16           -0.19            0.02           -0.20
          Std. Dev.         0.34          0.48            0.48            0.54            0.46
          Efficiency        100%*         143%            139%            121%            143%
n = 600   Bias             -0.53         -0.20           -0.21           -0.03           -0.22
          Std. Dev.         0.24          0.33            0.34            0.38            0.33
          Efficiency        100%*         223%            215%            234%            216%

* By definition.
CASE II(a)

$(\alpha, \beta) = (-1.4, 1.4)$, $(d_i) \sim \sigma_d(\chi_1^2 - 1)/\sqrt{2}$, $\sigma_d^2 = 0.1$, $(\epsilon_i) \sim N(0, \sigma_\epsilon^2 = \sigma_d^2/3)$.

                        $\hat\beta$   $\hat\beta_c$   $\hat\beta_f$   $\hat\beta_N$   $\hat\beta_s$
n = 300   Bias             -0.28         -0.05           -0.07            0.10           -0.08
          Std. Dev.         0.47          0.58            0.57            0.66            0.56
          Efficiency        100%*         90%             91%             69%             93%
n = 600   Bias             -0.27         -0.03           -0.04            0.11           -0.05
          Std. Dev.         0.33          0.41            0.41            0.45            0.40
          Efficiency        100%*         111%            110%            85%             112%

* By definition.

CASE II(b)

Same as Case II(a) but measurement errors have the contaminated normal distribution.

                        $\hat\beta$   $\hat\beta_c$   $\hat\beta_f$   $\hat\beta_N$   $\hat\beta_s$
n = 300   Bias             -0.43         -0.13           -0.15            0.12           -0.17
          Std. Dev.         0.33          0.44            0.45            0.53            0.43
          Efficiency        100%*         141%            134%            103%            141%
n = 600   Bias             -0.46         -0.15           -0.16            0.10           -0.18
          Std. Dev.         0.25          0.33            0.34            0.40            0.33
          Efficiency        100%*         201%            190%            159%            194%

* By definition.
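The efficiency rows can be approximately recovered from the bias and standard deviation rows. The sketch below assumes that mean squared error is computed as bias squared plus variance and that efficiency is the ratio of the ordinary logistic regression MSE to the competitor's MSE, expressed in percent; the function name is hypothetical. Because the tabulated inputs are rounded to two decimals, the recomputed values can differ from the printed percentages by a point or so.

```python
def mse_efficiency(bias_ref, sd_ref, bias, sd):
    """Percent MSE efficiency of an estimator relative to a reference estimator."""
    mse_ref = bias_ref ** 2 + sd_ref ** 2     # MSE of ordinary logistic regression
    mse = bias ** 2 + sd ** 2                 # MSE of the alternative estimator
    return 100.0 * mse_ref / mse

# Case I(a), n = 600: ordinary logistic regression versus the corrected estimator.
eff = mse_efficiency(-0.22, 0.33, -0.05, 0.38)
```

With these rounded inputs the ratio comes out close to 107, against the tabulated 108%.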